Plan Phase 3 Space background workers #171

Open
akseljoonas wants to merge 2 commits into main from plan-space-background-workers

akseljoonas (Collaborator) commented Apr 28, 2026

Summary

  • document the Phase 3 background-worker architecture for Space sessions
  • add Mongo-backed durable session_runs queue methods, indexes, claim/heartbeat/finish/interrupt-expired lifecycle
  • add opt-in background /api/chat path behind ML_INTERN_BACKGROUND_WORKERS
  • add Mongo-polled SSE replay for worker-owned runs
  • add worker runtime plus worker Space health app (ML_INTERN_PROCESS_ROLE=worker)
  • keep current direct execution path as default rollback behavior
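
The claim/heartbeat/finish/interrupt-expired lifecycle listed above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the `RunQueue` class, the field names (`status`, `lease_owner`, `lease_expires_at`), and the status strings are hypothetical and are not taken from `agent/core/session_persistence.py`. The real queue is Mongo-backed, where a claim would have to be a single atomic update so that two workers cannot take the same run.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Run:
    run_id: str
    status: str = "queued"  # queued -> running -> finished | interrupted
    lease_owner: Optional[str] = None
    lease_expires_at: float = 0.0


@dataclass
class RunQueue:
    """In-memory stand-in for a durable run queue (all names are hypothetical)."""

    lease_seconds: float = 30.0
    runs: dict = field(default_factory=dict)

    def enqueue(self, run_id: str) -> Run:
        run = Run(run_id)
        self.runs[run_id] = run
        return run

    def claim(self, worker_id: str, now: float) -> Optional[Run]:
        # Take the first queued run (insertion order) and attach a lease to it.
        for run in self.runs.values():
            if run.status == "queued":
                run.status = "running"
                run.lease_owner = worker_id
                run.lease_expires_at = now + self.lease_seconds
                return run
        return None

    def heartbeat(self, run_id: str, worker_id: str, now: float) -> bool:
        # Only the current lease holder may extend the lease.
        run = self.runs[run_id]
        if run.status != "running" or run.lease_owner != worker_id:
            return False
        run.lease_expires_at = now + self.lease_seconds
        return True

    def finish(self, run_id: str, worker_id: str) -> bool:
        # Only the current lease holder may mark the run finished.
        run = self.runs[run_id]
        if run.status != "running" or run.lease_owner != worker_id:
            return False
        run.status = "finished"
        return True

    def interrupt_expired(self, now: float) -> list:
        # Runs whose lease has lapsed are marked interrupted, not re-queued.
        expired = [
            run for run in self.runs.values()
            if run.status == "running" and run.lease_expires_at <= now
        ]
        for run in expired:
            run.status = "interrupted"
        return [run.run_id for run in expired]
```

In this model a worker that stops sending heartbeats loses its run on the next `interrupt_expired` sweep, and the run ends as `interrupted` instead of being handed to another worker.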

Rollout notes

  • default remains unchanged until ML_INTERN_BACKGROUND_WORKERS=true
  • safe first rollout is in-process worker: ML_INTERN_BACKGROUND_WORKERS=true + ML_INTERN_RUN_WORKER_IN_PROCESS=true
  • separate worker Space entrypoint exists, but user-scoped HF tool execution still needs a token handoff/token-broker design before production use
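
As a rough illustration of how the two rollout flags combine, the sketch below maps them to an execution mode. The function name, the mode strings, and the accepted truthy spellings are assumptions made for this example; only the environment variable names come from this PR, and the actual parsing in the backend may differ.

```python
import os
from typing import Mapping


def _flag(env: Mapping[str, str], name: str) -> bool:
    # Accepted spellings ("1", "true", "yes") are an assumption of this sketch.
    return env.get(name, "").strip().lower() in {"1", "true", "yes"}


def resolve_execution_mode(env: Mapping[str, str] = os.environ) -> str:
    """Return "direct", "in_process_worker", or "external_worker" (hypothetical labels)."""
    if not _flag(env, "ML_INTERN_BACKGROUND_WORKERS"):
        # Default and rollback path: the request handler runs the agent itself.
        return "direct"
    if _flag(env, "ML_INTERN_RUN_WORKER_IN_PROCESS"):
        # Safe first rollout: the worker loop runs inside the main process.
        return "in_process_worker"
    # Runs are queued for a separate worker Space to pick up.
    return "external_worker"
```

Note that `ML_INTERN_RUN_WORKER_IN_PROCESS` has no effect on its own in this sketch; the direct path stays in force until `ML_INTERN_BACKGROUND_WORKERS` is enabled.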

Testing

  • UV_CACHE_DIR=/tmp/uv-cache uv run --extra dev pytest tests/unit/test_session_persistence.py tests/unit/test_background_worker.py tests/unit/test_session_manager_persistence.py tests/unit/test_user_quotas.py -q
  • UV_CACHE_DIR=/tmp/uv-cache uv run python -m py_compile agent/core/session_persistence.py backend/background_worker.py backend/worker_app.py backend/main.py backend/routes/agent.py backend/session_manager.py
  • git diff --check
  • import smoke test for background_worker, worker_app, and main

Document Phase 3 as a durable Mongo run queue plus a shared Hugging Face worker Space, so the next implementation can preserve browser reconnect semantics without binding agent execution to SSE lifetimes.
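
One way to read "preserve browser reconnect semantics" is cursor-based replay: the browser reports the last event it saw, and the server re-sends only newer stored events. The sketch below shows that idea on plain dictionaries. The event shape (`seq`, `payload`) and both function names are hypothetical; the PR's replay path polls Mongo, which is not modeled here.

```python
import json
from typing import Iterable, Iterator, Optional


def format_sse(event: dict) -> str:
    # One SSE frame; the id lets a reconnecting client report where it stopped.
    return f"id: {event['seq']}\ndata: {json.dumps(event['payload'])}\n\n"


def replay_events(stored: Iterable[dict], last_event_id: Optional[int]) -> Iterator[str]:
    """Yield SSE frames, in sequence order, for events newer than the client's cursor."""
    cursor = -1 if last_event_id is None else last_event_id
    for event in sorted(stored, key=lambda e: e["seq"]):
        if event["seq"] > cursor:
            yield format_sse(event)
```

Because the frames are derived from stored events and not from a live connection, closing the browser tab does not affect the run; a later request with the last seen id simply resumes from that point.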

Constraint: Mongo session persistence must be healthy before background workers can own execution.

Rejected: One worker Space per session | shared workers are simpler to operate and Mongo leases already provide coordination.

Confidence: high

Scope-risk: narrow

Directive: Keep browser/SSE as an observer; durable session_runs and workers should own execution in Phase 3.

Tested: git diff --check

Not-tested: Documentation-only change; no runtime tests run.

github-actions bot commented Apr 28, 2026

Claude encountered an error.


I'll analyze this and get back to you.

Add the Phase 3 foundation: Mongo-backed session_runs, an opt-in /api/chat enqueue path, Mongo-polled SSE replay, and a worker runtime that reuses the existing SessionManager and agent loop. The default remains the existing direct execution path until ML_INTERN_BACKGROUND_WORKERS is enabled.

Constraint: External worker Spaces cannot safely execute user-scoped HF operations until token handoff is designed; in-process workers are the safe first rollout.

Rejected: Persist raw user OAuth tokens in Mongo | broadens the trust boundary and should not be the default handoff mechanism.

Rejected: Replay expired running tool calls | non-idempotent tools can duplicate side effects, so expired runs are marked interrupted.

Confidence: medium

Scope-risk: broad

Directive: Keep ML_INTERN_BACKGROUND_WORKERS disabled until prod Mongo is healthy and the worker mode is smoke-tested.

Tested: UV_CACHE_DIR=/tmp/uv-cache uv run --extra dev pytest tests/unit/test_session_persistence.py tests/unit/test_background_worker.py tests/unit/test_session_manager_persistence.py tests/unit/test_user_quotas.py -q

Tested: UV_CACHE_DIR=/tmp/uv-cache uv run python -m py_compile agent/core/session_persistence.py backend/background_worker.py backend/worker_app.py backend/main.py backend/routes/agent.py backend/session_manager.py

Tested: git diff --check

Tested: import smoke test for background_worker, worker_app, and main

github-actions bot commented Apr 28, 2026

Claude encountered an error.


I'll analyze this and get back to you.

fglogan commented May 3, 2026

closed per maintainer request
